Skip to content

MDEV-39520: Add MySQL 8.0 extended syntax for REGEXP_INSTR#5035

Open
MohamedM216 wants to merge 1 commit into
MariaDB:mainfrom
MohamedM216:mysql8-extended-regexp
Open

MDEV-39520: Add MySQL 8.0 extended syntax for REGEXP_INSTR#5035
MohamedM216 wants to merge 1 commit into
MariaDB:mainfrom
MohamedM216:mysql8-extended-regexp

Conversation

@MohamedM216
Copy link
Copy Markdown
Contributor

@MohamedM216 MohamedM216 commented May 3, 2026

Reference Issue

MDEV-39520

Description

This PR is one of a series of PRs that are supposed to add MySQL 8.0-compatible extended syntax for all REGEXP functions

Test

regexp_instr_mysql8.test

@MohamedM216 MohamedM216 force-pushed the mysql8-extended-regexp branch 2 times, most recently from d2ac796 to ef7d499 Compare May 3, 2026 16:29
@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label May 4, 2026
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! This is a preliminary review.

First of all: this is implementing just one sub-task of what https://jira.mariadb.org/browse/MDEV-39106 calls for. Either do all of it, as scoped in the jira, in a single go. Or create Jira sub-task(s) into the above jira and target these instead.

There's also some suggestions below.

Comment thread sql/item_cmpfunc.cc Outdated
Comment thread sql/item_cmpfunc.h Outdated
@MohamedM216 MohamedM216 changed the title MDEV-39106: Add MySQL 8.0 extended syntax for REGEXP_INSTR MDEV-39520: Add MySQL 8.0 extended syntax for REGEXP_INSTR May 4, 2026
@MohamedM216 MohamedM216 force-pushed the mysql8-extended-regexp branch 2 times, most recently from d0e479a to 2634104 Compare May 4, 2026 17:46
@MohamedM216 MohamedM216 requested a review from gkodinov May 4, 2026 19:07
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the failing buildbot tests. Some improvement suggestions below too.

Comment thread mysql-test/main/regexp_instr_mysql8.test Outdated
Comment thread sql/item_cmpfunc.cc
Comment thread sql/item_cmpfunc.h Outdated
@MohamedM216 MohamedM216 force-pushed the mysql8-extended-regexp branch from 2634104 to 1518554 Compare May 5, 2026 12:35
@MohamedM216
Copy link
Copy Markdown
Contributor Author

The current failing tests aren't related to my patch and they pass locally.

@MohamedM216 MohamedM216 requested a review from gkodinov May 5, 2026 17:52
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please stand by for the final review.

Copy link
Copy Markdown
Member

@sanja-byelkin sanja-byelkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found implementation problem should be fixed. Also performance and functionality test should be made after code review done

Comment thread sql/item_cmpfunc.cc Outdated

re.init(cmp_collation.collation, 0);

if (arg_count > 5 && args[5]->const_item())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const_item is not basic constant. prepare place holder (parameter '?') is a constant for example but changes, some other cases also possible if I remember correctly. So tests with changing parameter of prepared statement should be added.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test for this case is added now at the end of the test file.

Comment thread sql/item_cmpfunc.cc Outdated
{
char mt_buf[64];
String mt_tmp(mt_buf, sizeof(mt_buf), &my_charset_latin1);
String *match_type_str= args[5]->val_str(&mt_tmp);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If It was constant and flags was calculated on fix field what sens to repeat here again and again the same exercise?

Comment thread sql/item_cmpfunc.cc Outdated
if ((null_value= args[5]->null_value))
return 0;

if (!args[5]->const_item())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here we are suppose to use old value but above we get string value of the Item what the sens in above evaluation?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also constant flag can change try this with tables of 1 row length and it will fail (please add such test also). i.e. if we cached values on fix_field w should not detect by call of const_item during execution bacouse we have make_const calls, we have substitution and maybe I forgot about something else. So please use something else (maybe just boolean flag)

Copy link
Copy Markdown
Contributor Author

@MohamedM216 MohamedM216 May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also constant flag can change try this with tables of 1 row length and it will fail

You're right, it fails, but using the boolean trick to store const_item()'s value then reuse it fixes this bug.

The test is found at the end of the test file.

Comment thread sql/item_cmpfunc.cc
}

char subject_buf[MAX_FIELD_WIDTH];
String subject_tmp(subject_buf, sizeof(subject_buf), &my_charset_bin);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not specialist in regexp but I feel it can be not so efficient as previous code. So testing of performance should be done. @gkodinov could you organise one. Also it is new feature which rewrite rexexp so after code review it have to go to a tester review as we do with all features.

Extend REGEXP_INSTR to accept the full MySQL 8.0 signature:

  REGEXP_INSTR(subject, pattern
               [, pos [, occurrence [, return_option
               [, match_type]]]])

Previously only 2 arguments were accepted. Changes:

- Switch Create_func_regexp_instr from Create_func_arg2 to
  Create_native_func to allow 2-6 arguments.
- Add parse_match_type_flags() to Regexp_processor_pcre, which parses
  the match_type flags (c/i/m/n/u) and overwrites m_library_flags
  with the fully-resolved PCRE2 flag word.
- Call parse_match_type_flags() in fix_length_and_dec() before fix_owner()
  when match_type is constant, so the pattern is compiled with the
  correct flags. For constant patterns fix_owner() sets m_is_const=true,
  making recompile() a permanent no-op.
- For non-constant match_type, resolve flags per-row in val_int()
  and call compile() directly to bypass the recompile() no-op guard.
- Add MTR test: regexp_instr_mysql8.test
@MohamedM216 MohamedM216 force-pushed the mysql8-extended-regexp branch from 1518554 to 96f67b0 Compare May 14, 2026 20:53
@MohamedM216 MohamedM216 requested a review from sanja-byelkin May 14, 2026 21:06
@MohamedM216
Copy link
Copy Markdown
Contributor Author

Thanks @sanja-byelkin for the detailed feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

3 participants